Overview

The bulk_market_analyzer.py script is the primary processing engine of the ChartsMaze EDL Pipeline. It combines fundamental data, technical indicators, shareholding patterns, and market data to create a comprehensive analysis for all stocks in the dataset.

Purpose

This script processes raw fundamental and technical data to generate a unified analysis dataset containing:
  • Quarterly and annual financial metrics (P&L, Sales, EPS, OPM)
  • Valuation ratios (P/E, ROE, ROCE, D/E, PEG)
  • Shareholding patterns and institutional activity
  • Technical indicators and price performance
  • Advanced indicator signals (SMA/EMA positioning, RSI, MACD)
  • Index membership and market classification

Input Files Required

  • fundamental_data.json (JSON, required): Core fundamental data containing quarterly and annual financial statements, ratios, and company metadata.
  • dhan_data_response.json (JSON, required): Technical data including price, moving averages, RSI, volume, and index membership from the Dhan API.
  • advanced_indicator_data.json (JSON, optional): Advanced technical indicators including SMA/EMA arrays, oscillators, pivots, and trend signals. The script runs without this file, but with reduced technical analysis.
  • nse_equity_list.csv (CSV): NSE equity listing information containing stock symbols and listing dates.
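
Because advanced_indicator_data.json is optional, its loading step has to tolerate a missing file. The sketch below shows one way this might be done; the helper name load_optional_json, the warning text, and the sample symbol "ABC" are assumptions for illustration and are not taken from the script.

```python
import json
import os
import tempfile


def load_optional_json(path, default):
    """Return parsed JSON from path, or default if the file is missing or invalid.

    Hypothetical helper: the real script may handle this differently.
    """
    if not os.path.exists(path):
        print(f"Warning: {path} not found, continuing without it")
        return default
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except json.JSONDecodeError as exc:
        print(f"Warning: could not parse {path}: {exc}")
        return default


# A temporary directory stands in for the pipeline folder.
with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the optional file does not exist, so the default is returned.
    missing = load_optional_json(
        os.path.join(tmp, "advanced_indicator_data.json"), {}
    )

    # Case 2: the file exists and is parsed normally.
    present_path = os.path.join(tmp, "present.json")
    with open(present_path, "w", encoding="utf-8") as f:
        json.dump({"ABC": {"SMA": []}}, f)
    present = load_optional_json(present_path, {})
```

Returning an empty dict as the default lets later lookups such as adv_tech.get("SMA", []) work unchanged when the file is absent.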

Output Produced

  • all_stocks_fundamental_analysis.json (JSON): Comprehensive analysis file containing enriched data for all stocks. Each stock record includes fundamental metrics, technical indicators, and derived calculations.

Processing Logic

1. Data Loading

The script loads multiple data sources and creates lookup maps:
# Load fundamental data
with open(input_file, "r") as f:
    data = json.load(f)

# Load Listing Dates from NSE CSV
listing_date_map = {}
with open(csv_path, "r") as f:
    reader = csv.DictReader(f)
    for row in reader:
        sym = row.get("SYMBOL")
        date_list = row.get(" DATE OF LISTING") or row.get("DATE OF LISTING")
        if sym and date_list:
            listing_date_map[sym] = date_list
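
The double lookup on " DATE OF LISTING" and "DATE OF LISTING" covers a header that may or may not carry a leading space. An alternative, sketched here as an assumption and not as the script's own approach, is to strip whitespace from every header and value first. The CSV content below is invented sample data.

```python
import csv
import io

# Invented sample; note the leading space before "DATE OF LISTING".
sample_csv = "SYMBOL, DATE OF LISTING\nABC,06-OCT-2008\nXYZ,12-MAY-2011\n"

listing_date_map = {}
reader = csv.DictReader(io.StringIO(sample_csv))
for row in reader:
    # Normalize keys and values so header spacing no longer matters.
    clean = {
        k.strip(): v.strip()
        for k, v in row.items()
        if k is not None and v is not None
    }
    sym = clean.get("SYMBOL")
    listed = clean.get("DATE OF LISTING")
    if sym and listed:
        listing_date_map[sym] = listed

print(listing_date_map)
```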

2. Financial Metrics Extraction

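The script extracts values from pipe-delimited strings that represent time series. In the quarterly series used below, index 0 is the latest quarter, index 1 is the previous quarter, and index 4 is the same quarter one year earlier: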
def get_value_from_pipe_string(pipe_string, index):
    if not pipe_string:
        return 0.0
    parts = pipe_string.split('|')
    if index < len(parts):
        return get_float(parts[index])
    return 0.0

# Example: Extract quarterly net profit values
np_latest = get_value_from_pipe_string(cq.get("NET_PROFIT"), 0)
np_prev = get_value_from_pipe_string(cq.get("NET_PROFIT"), 1)
np_last_year_q = get_value_from_pipe_string(cq.get("NET_PROFIT"), 4)
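
To make the indexing concrete, the sketch below runs the same helper on an invented series. The sample values are assumptions; the two functions mirror the ones shown in this document.

```python
def get_float(value):
    """Convert to float; return 0.0 when conversion fails."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return 0.0


def get_value_from_pipe_string(pipe_string, index):
    """Return the value at a zero-based position in a pipe-delimited series."""
    if not pipe_string:
        return 0.0
    parts = pipe_string.split("|")
    if index < len(parts):
        return get_float(parts[index])
    return 0.0


# Invented sample series, newest quarter first.
net_profit = "120.5|110.0|95.2|90.1|100.0"

latest = get_value_from_pipe_string(net_profit, 0)            # 120.5
same_q_last_year = get_value_from_pipe_string(net_profit, 4)  # 100.0
out_of_range = get_value_from_pipe_string(net_profit, 7)      # 0.0
missing_series = get_value_from_pipe_string(None, 0)          # 0.0
bad_token = get_value_from_pipe_string("n/a|42", 0)           # 0.0
```

Note that a missing series, an out-of-range index, and an unparseable token all come back as 0.0, so downstream code cannot tell them apart from a genuine zero.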

3. Growth Calculations

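QoQ and YoY percentage changes are calculated for key metrics. The denominator is abs(previous), so the sign of the result follows the direction of the change even when the previous value is negative, and a previous value of 0 yields 0.0 instead of a division error: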
def calculate_change(current, previous):
    if previous == 0:
        return 0.0
    return ((current - previous) / abs(previous)) * 100

# Apply to metrics
qoq_np = calculate_change(np_latest, np_prev)
yoy_np = calculate_change(np_latest, np_last_year_q)
qoq_eps = calculate_change(eps_latest, eps_prev)
yoy_eps = calculate_change(eps_latest, eps_last_year_q)
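
A few worked cases show how the abs(previous) denominator behaves, in particular when a loss narrows or turns into a profit. The input numbers are invented for illustration.

```python
def calculate_change(current, previous):
    """Percentage change relative to the magnitude of the previous value."""
    if previous == 0:
        return 0.0
    return ((current - previous) / abs(previous)) * 100


growth = calculate_change(150.0, 100.0)          # profit rises:      50.0
decline = calculate_change(75.0, 100.0)          # profit falls:     -25.0
loss_narrows = calculate_change(-50.0, -100.0)   # smaller loss:      50.0
turnaround = calculate_change(50.0, -100.0)      # loss to profit:   150.0
no_base = calculate_change(10.0, 0.0)            # zero base:          0.0
```

Without abs(), the loss_narrows case would come out as -50.0, which would read as a deterioration even though the loss halved.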

4. Ratio Calculations

Derived financial ratios:
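# Debt-to-Equity Ratio (non-current liabilities serve as the debt figure)
non_current_liab = get_value_from_pipe_string(bs_c.get("NON_CURRENT_LIABILITIES"), 0)
total_equity = get_value_from_pipe_string(bs_c.get("TOTAL_EQUITY"), 0)
de_ratio = non_current_liab / total_equity if total_equity != 0 else 0.0

# PEG Ratio (assumed default of 0.0 when growth or P/E is not positive)
peg = 0.0
if yoy_eps > 0 and pe > 0:
    peg = pe / yoy_eps

# Forward P/E: trailing P/E rescaled by TTM EPS over annualized latest-quarter EPS
# (assumed default of 0.0 when it cannot be computed)
forward_pe = 0.0
if eps_latest > 0 and pe > 0:
    annualized_eps = eps_latest * 4  # positive because eps_latest > 0
    ttm_eps = get_float(ttm_cy.get("EPS"))
    forward_pe = pe * (ttm_eps / annualized_eps)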
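
The PEG and Forward P/E arithmetic can be checked with round numbers. The sketch below wraps the two formulas in a hypothetical function, peg_and_forward_pe, with an assumed fallback of 0.0; the inputs are invented. Since P/E is price divided by TTM EPS, the Forward P/E formula is equivalent to price divided by annualized latest-quarter EPS.

```python
def peg_and_forward_pe(pe, yoy_eps, eps_latest, ttm_eps):
    """Return (peg, forward_pe); 0.0 is an assumed fallback for each."""
    peg = pe / yoy_eps if (yoy_eps > 0 and pe > 0) else 0.0

    forward_pe = 0.0
    if eps_latest > 0 and pe > 0:
        annualized_eps = eps_latest * 4
        forward_pe = pe * (ttm_eps / annualized_eps)
    return peg, forward_pe


# P/E 20, EPS growth 40% YoY, latest quarterly EPS 5, TTM EPS 10.
peg, forward_pe = peg_and_forward_pe(
    pe=20.0, yoy_eps=40.0, eps_latest=5.0, ttm_eps=10.0
)

# Cross-check: implied price = P/E * TTM EPS, divided by annualized EPS.
implied_price = 20.0 * 10.0
cross_check = implied_price / (5.0 * 4)

# Negative growth leaves PEG at the fallback value.
peg_negative_growth, _ = peg_and_forward_pe(20.0, -5.0, 5.0, 10.0)
```

Here peg is 0.5 and forward_pe is 10.0, matching the cross-check of 200 / 20.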

5. Shareholding Analysis

# FII & DII Quarterly Changes
fii_latest = get_value_from_pipe_string(shp.get("FII"), 0)
fii_prev = get_value_from_pipe_string(shp.get("FII"), 1)
fii_change_qoq = fii_latest - fii_prev

dii_latest = get_value_from_pipe_string(shp.get("DII"), 0)
dii_prev = get_value_from_pipe_string(shp.get("DII"), 1)
dii_change_qoq = dii_latest - dii_prev

# Free Float Calculation
promoter_latest = get_value_from_pipe_string(shp.get("PROMOTER"), 0)
free_float_pct = 100.0 - promoter_latest

# Float Shares in Crores
if mcap_cr > 0 and ltp > 0:
    total_shares_cr = mcap_cr / ltp
    float_shares_cr = total_shares_cr * (free_float_pct / 100.0)
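
The float calculation relies on a unit identity: market cap in crore rupees divided by price in rupees gives the share count in crores. A minimal worked example, with invented inputs and a hypothetical function name:

```python
def float_shares(mcap_cr, ltp, promoter_pct):
    """Return (free_float_pct, float_shares_cr) from market cap, price, and promoter holding."""
    free_float_pct = 100.0 - promoter_pct
    float_shares_cr = 0.0  # assumed fallback when inputs are not positive
    if mcap_cr > 0 and ltp > 0:
        total_shares_cr = mcap_cr / ltp  # crore rupees / rupees = crore shares
        float_shares_cr = total_shares_cr * (free_float_pct / 100.0)
    return free_float_pct, float_shares_cr


# Market cap 50,000 Cr., price 2,500, promoters hold 50%.
free_float_pct, float_shares_cr = float_shares(50000.0, 2500.0, 50.0)
```

With these inputs there are 20 crore shares in total, of which half (10 crore) are free float. The definition treats every non-promoter share as free float.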

6. Technical Indicators Processing

# Price positioning
ltp = get_float(tech.get("Ltp", 0))
high_52w = get_float(tech.get("High1Yr", 0))
if high_52w > 0 and ltp > 0:
    pct_from_52w_high = ((ltp - high_52w) / high_52w) * 100

# SMA positioning
sma_200 = get_float(tech.get("DaySMA200CurrentCandle", 0))
if sma_200 > 0 and ltp > 0:
    pct_from_sma_200 = ((ltp - sma_200) / sma_200) * 100
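
Both calculations above share one formula, the signed percentage distance of the price from a reference level. The sketch below factors it into a hypothetical helper, pct_distance, which is not part of the script; the sample prices are invented.

```python
def pct_distance(ltp, reference):
    """Signed % distance of ltp from reference; 0.0 if either is not positive."""
    if reference > 0 and ltp > 0:
        return ((ltp - reference) / reference) * 100
    return 0.0


pct_from_52w_high = pct_distance(90.0, 120.0)  # price 25% below its 52-week high
pct_from_sma_200 = pct_distance(90.0, 80.0)    # price 12.5% above its 200-day SMA
no_data = pct_distance(90.0, 0.0)              # missing reference level
```

A negative result means the price sits below the reference, so the distance from the 52-week high is never positive in normal data.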

7. Advanced Indicators Parsing

Extracts signals from SMA/EMA arrays and technical indicators:
# SMA Signals - Calculate % Away
sma_signals = []
smas = adv_tech.get("SMA", [])
target_smas = ["20", "50", "200"]

for s in smas:
    ind_name = s.get("Indicator", "").replace("-SMA", "")
    val = get_float(s.get("Value"))
    
    if ind_name in target_smas and val > 0 and ltp > 0:
        diff = ((ltp - val) / val) * 100
        status = "Above" if diff > 0 else "Below"
        sma_signals.append(f"SMA {ind_name}: {status} ({round(diff, 1)}%)")

# Parse Oscillators/Trend
tech_inds = adv_tech.get("TechnicalIndicators", [])
sentiment_summary = []
for t in tech_inds:
    name = t.get("Indicator", "")
    action = t.get("Action", "")
    if "RSI" in name:
        sentiment_summary.append(f"RSI: {action}")
    elif "MACD" in name:
        sentiment_summary.append(f"MACD: {action}")
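
Running the parsing logic on a small input shows the strings it produces. The payload below is invented: the "20-SMA" naming is inferred from the replace("-SMA", "") call, and the indicator names and Action values are assumptions about the data source.

```python
def get_float(value):
    try:
        return float(value)
    except (ValueError, TypeError):
        return 0.0


# Invented sample payload for one stock.
adv_tech = {
    "SMA": [
        {"Indicator": "20-SMA", "Value": "100"},
        {"Indicator": "50-SMA", "Value": "125"},
        {"Indicator": "100-SMA", "Value": "110"},  # not a target period
    ],
    "TechnicalIndicators": [
        {"Indicator": "RSI(14)", "Action": "Neutral"},
        {"Indicator": "MACD(12,26)", "Action": "Bullish"},
        {"Indicator": "ADX(14)", "Action": "Neutral"},  # ignored by the filter
    ],
}
ltp = 110.0
target_smas = ["20", "50", "200"]

sma_signals = []
for s in adv_tech.get("SMA", []):
    ind_name = s.get("Indicator", "").replace("-SMA", "")
    val = get_float(s.get("Value"))
    if ind_name in target_smas and val > 0 and ltp > 0:
        diff = ((ltp - val) / val) * 100
        status = "Above" if diff > 0 else "Below"
        sma_signals.append(f"SMA {ind_name}: {status} ({round(diff, 1)}%)")

sentiment_summary = []
for t in adv_tech.get("TechnicalIndicators", []):
    name = t.get("Indicator", "")
    action = t.get("Action", "")
    if "RSI" in name:
        sentiment_summary.append(f"RSI: {action}")
    elif "MACD" in name:
        sentiment_summary.append(f"MACD: {action}")

print(sma_signals)
print(sentiment_summary)
```

Because the status test is diff > 0, a price exactly equal to the moving average is labeled "Below".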

8. Index Membership Mapping

# Map stocks to relevant market indices
requested_indices = {13,51,38,17,18,19,20,37,1,442,443,22,5,3,444,7,14,25,27,28,447,35,41,46,44,16,43,42,45,39,466,34,32,15,33,31,30,29}
indices_found = []
idx_list_raw = tech.get("idxlist", [])
if isinstance(idx_list_raw, list):
    for idx_obj in idx_list_raw:
        idx_id = idx_obj.get("Indexid")
        idx_name = idx_obj.get("Name")
        if idx_id in requested_indices and idx_name:
            indices_found.append(idx_name)
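
The filter can be exercised on a small invented idxlist. The function name indices_for and the index names are hypothetical, and only a subset of the requested IDs is used for brevity.

```python
def indices_for(tech, requested):
    """Return names of indices in tech['idxlist'] whose Indexid is requested."""
    found = []
    idx_list_raw = tech.get("idxlist", [])
    if isinstance(idx_list_raw, list):
        for idx_obj in idx_list_raw:
            idx_id = idx_obj.get("Indexid")
            idx_name = idx_obj.get("Name")
            if idx_id in requested and idx_name:
                found.append(idx_name)
    return found


requested_indices = {13, 51, 38}  # subset of the full set, for brevity

sample_tech = {
    "idxlist": [
        {"Indexid": 13, "Name": "Index A"},    # requested, kept
        {"Indexid": 999, "Name": "Index B"},   # not requested
        {"Indexid": "51", "Name": "Index C"},  # string ID does not match int 51
        {"Indexid": 38, "Name": ""},           # requested but unnamed
    ]
}

indices_found = indices_for(sample_tech, requested_indices)
no_list = indices_for({"idxlist": None}, requested_indices)
```

The third entry illustrates a type pitfall: the set holds integers, so an ID delivered as a string would be silently skipped unless it is converted first.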

Fields Added/Modified

This script creates the foundational dataset with the following field structure:

Basic Information

  • Symbol: Stock symbol
  • Name: Company name
  • Listing Date: Date of listing on exchange
  • Basic Industry: Industry classification
  • Sector: Sector classification
  • Market Cap(Cr.): Market capitalization in crores
  • Latest Quarter: Most recent financial quarter reported

Profitability Metrics

  • Net Profit Latest Quarter to Net Profit 3 Quarters Back: Historical net profit
  • QoQ % Net Profit Latest: Quarter-over-quarter net profit change
  • YoY % Net Profit Latest: Year-over-year net profit change

EPS Metrics

  • EPS Latest Quarter to EPS 3 Quarters Back: Quarterly EPS history
  • QoQ % EPS Latest: Quarter-over-quarter EPS growth
  • YoY % EPS Latest: Year-over-year EPS growth
  • EPS Last Year: Full-year EPS from previous year
  • EPS 2 Years Back: Full-year EPS from two years ago

Sales Metrics

  • Sales Latest Quarter to Sales 3 Quarters Back: Quarterly sales
  • QoQ % Sales Latest: Quarter-over-quarter sales growth
  • YoY % Sales Latest: Year-over-year sales growth
  • Sales Growth 5 Years(%): Compound annual growth rate (CAGR) of sales over 5 years

Margin Metrics

  • OPM Latest Quarter to OPM 3 Quarters Back: Operating profit margin history
  • QoQ % OPM Latest: Quarter-over-quarter OPM change
  • YoY % OPM Latest: Year-over-year OPM change
  • OPM TTM(%): Trailing twelve months operating profit margin

Valuation Ratios

  • ROE(%): Return on equity
  • ROCE(%): Return on capital employed
  • D/E: Debt-to-equity ratio
  • P/E: Price-to-earnings ratio
  • PEG: Price/earnings-to-growth ratio
  • Forward P/E: Forward-looking P/E based on annualized latest quarter EPS

Shareholding Pattern

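  • FII % change QoQ: Change in foreign institutional investor holding versus the previous quarter, in percentage points (latest minus previous)
  • DII % change QoQ: Change in domestic institutional investor holding versus the previous quarter, in percentage points (latest minus previous)
  • Free Float(%): Percentage of shares available for public trading, computed as 100 minus the promoter holding
  • Float Shares(Cr.): Free float shares in crores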

Technical Indicators

  • Stock Price(₹): Current market price
  • Index: List of indices the stock belongs to
  • 1 Day Returns(%) to 1 Year Returns(%): Historical returns
  • RSI (14): 14-period Relative Strength Index
  • % from 52W High: Distance from 52-week high
  • SMA Status: Price positioning relative to SMA 20, 50, 200
  • EMA Status: Price positioning relative to EMA 20, 50, 200
  • Technical Sentiment: Aggregated signals from RSI, MACD
  • Pivot Point: Classic pivot point value
  • Gap Up %: Opening gap percentage (placeholder, updated by advanced processor)

Code Example

bulk_market_analyzer.py
import json
import csv
import os

def get_float(value_str):
    try:
        return float(value_str)
    except (ValueError, TypeError):
        return 0.0

def calculate_change(current, previous):
    if previous == 0:
        return 0.0
    return ((current - previous) / abs(previous)) * 100

def analyze_all_stocks():
    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
    input_file = os.path.join(BASE_DIR, "fundamental_data.json")
    ADVANCED_FILE = os.path.join(BASE_DIR, "advanced_indicator_data.json")
    output_file = os.path.join(BASE_DIR, "all_stocks_fundamental_analysis.json")

    # Load fundamental data
    with open(input_file, "r") as f:
        data = json.load(f)
    
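    # Process each stock
    analyzed_data = []
    for item in data:
        symbol = item.get("Symbol", "UNKNOWN")
        # ... extract and calculate metrics (see Processing Logic) ...
        # Placeholder record; the full script fills in every field listed above
        stock_analysis = {"Symbol": symbol}
        analyzed_data.append(stock_analysis)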
    
    # Save results
    with open(output_file, "w") as f:
        json.dump(analyzed_data, f, indent=4)

if __name__ == "__main__":
    analyze_all_stocks()

Function Reference

get_float(value_str)

Safely converts string values to float.
Parameters:
  • value_str: String or numeric value to convert
Returns: Float value or 0.0 if conversion fails

calculate_change(current, previous)

Calculates percentage change between two values.
Parameters:
  • current: Current period value
  • previous: Previous period value
Returns: Percentage change relative to abs(previous) as a float, or 0.0 if previous is 0. The function as shown does not round the result.

get_value_from_pipe_string(pipe_string, index)

Extracts a value from a pipe-delimited time series string.
Parameters:
  • pipe_string: Pipe-delimited string (e.g., "100|95|90|85")
  • index: Zero-based position to extract
Returns: Float value at the specified index, or 0.0 if the string is empty, the index is out of range, or the value cannot be parsed

analyze_all_stocks()

Main processing function that orchestrates the entire analysis pipeline.
Returns: None (writes output to JSON file)

Performance Notes

  • Processes ~2,000 stocks in approximately 5-10 seconds
  • Memory usage scales with dataset size (typically 200-500 MB)
  • No parallelization implemented; runs sequentially
  • Can handle missing data sources gracefully with warnings

Dependencies

  • json: Standard library for JSON processing
  • csv: Standard library for CSV parsing
  • os: File path operations

Source File Location

bulk_market_analyzer.py:1-360